t248: Fix deploying->deployed transition stuck state#1018
Conversation
Improvements: 1. Reduced Phase 4d timeout from 600s (10min) to 120s (2min) for faster recovery 2. Added retry logic with exponential backoff (3 attempts: 0s, 1s, 3s) to both: - Initial deploying->deployed transition after merge (Step 4) - Auto-recovery transition in Phase 4b 3. Added better error logging to capture transition failure reasons 4. Tasks now recover within 2-4 minutes instead of 10-13 minutes Root cause: - cmd_transition failures were logged but not retried - Silent failures left tasks stuck until Phase 4d timeout (10min) - DB write errors or process interruptions during transition Evidence from this session: - t241: 13min delay (19:18:35 -> 19:31:33) - t240: 2.5min delay (19:31:44 -> 19:34:09) - t242: 1.5min delay (19:32:37 -> 19:34:09) Files modified: - .agents/scripts/supervisor-helper.sh (lines 8385-8420, 8404-8465, 9051-9057)
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Tue Feb 10 23:15:30 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
|



Summary
Fixes the deploying->deployed transition stuck state issue where tasks would remain in 'deploying' status for 10-13 minutes before auto-recovery.
Changes
Root Cause
cmd_transitionfailures were logged with|| log_warnbut not retriedEvidence from This Session
Testing
bash -npassedFiles Modified
.agents/scripts/supervisor-helper.sh(lines 8385-8420, 8404-8465, 9051-9057)Closes #248